In this notebook, we introduce the idea of comparing the outputs of predictions to what really happened: that is, "scoring" predictions.
We have implemented a variety of prediction algorithms. Some, for example the SEPP methods, have an explicit statistical model of a random process, attempt to fit data to that process, and then issue a prediction based on that fitting. Others, for example the classic Retrospective or Prospective hot-spotting techniques, produce a "ranking" of different grid cells, but make no real attempt to give meaning to the actual values of "risk" produced.
A prediction might also be on a network, but except for topological issues, we can think of this as simply a prediction with "grid" replaced by "edge of a network".
In principle, many prediction methods produce a (at least piecewise) continuous function which is then integrated to produce a grid prediction. At present, we have no way to use a continuous prediction directly, and so we shall not develop ways to compare such predictions.
Summary:
Simply put: we need some way of comparing a prediction to reality.
None of the prediction methods make any claim to predict that "a crime will definitely occur here" or that "a crime will not occur here". It is hence best to think of these predictions as probabilistic forecasts [2] (with the caveat above that some predictions cannot be expected to produce true probability distributions). In meteorology, classification in machine learning, and so on, it is common to try to "predict" discrete events. There is a large literature here (e.g. the extremely readable [1]), but it does not seem directly relevant to our work. In particular, our meaning of "hit rate" is distinct from that used in comparing classification results.
Probabilistic forecasting of spatial data is considered in the field of meteorology; see for example http://www.cawcr.gov.au/projects/verification/ which says:
In general, it is difficult to verify a single probabilistic forecast.
Instead, we compare a series of predictions against the series of actual events we are trying to predict.
In other notebooks we will study the following ideas in detail (rough, illustrative sketches of each are given after this list):
Pick a coverage level, say 10%, select the top 10% of grid cells flagged, and then calculate the percentage of actual events captured by this 10%.
Convert the estimated probability density to percentiles (order from least to most risky, with most risky being set to 1.0 and least risky being set to 0.0). This is obviously related to picking a coverage level. Then evaluate the "percentile score" at each event location to obtain a series. Compute a summary statistic (e.g. mean) or compare this series against the series from another prediction.
Compute the normalised likelihood of the actual events under the probability distribution given by the prediction.
Perform KDE on the actual events to generate a probability distribution, and then compare this to the prediction.
Adapt the Brier score, and others, to our setting.
Treat the prediction as a Bayesian prior, and use ideas from information theory to see how closely the posterior matches.
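As a rough illustration of the first idea, the sketch below computes a hit rate at a fixed coverage level using plain numpy. The grid size, risk values, event cells and the helper name hit_rate are all invented here for illustration; they are not taken from any particular prediction method in this library.

In [ ]:

import numpy as np

def hit_rate(risk, event_cells, coverage=0.1):
    """Fraction of events falling in the top `coverage` proportion of cells.

    risk : 2D array of predicted risk, one value per grid cell.
    event_cells : list of (row, col) cells in which actual events occurred.
    """
    n_cells = risk.size
    n_selected = int(np.ceil(coverage * n_cells))
    # Risk value of the n_selected-th riskiest cell, used as a threshold.
    threshold = np.sort(risk.ravel())[::-1][n_selected - 1]
    selected = risk >= threshold
    hits = sum(1 for (r, c) in event_cells if selected[r, c])
    return hits / len(event_cells)

# Toy data: a random 20x20 risk surface and 50 random event cells.
rng = np.random.default_rng(0)
risk = rng.random((20, 20))
events = [tuple(rng.integers(0, 20, size=2)) for _ in range(50)]
print(hit_rate(risk, events, coverage=0.1))
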
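A sketch of the percentile-score idea, again on invented toy data: rank the cells from least to most risky, rescale the ranks so that the least risky cell scores 0.0 and the most risky scores 1.0, then read off the score at each event's cell. The helper percentile_surface is hypothetical, not part of the library.

In [ ]:

import numpy as np

def percentile_surface(risk):
    """Return an array of the same shape as `risk`, with the least risky
    cell mapped to 0.0 and the most risky cell mapped to 1.0."""
    order = np.argsort(risk.ravel())       # cell indices, least to most risky
    ranks = np.empty_like(order)
    ranks[order] = np.arange(risk.size)    # rank 0 = least risky
    return (ranks / (risk.size - 1)).reshape(risk.shape)

rng = np.random.default_rng(1)
risk = rng.random((20, 20))
events = [tuple(rng.integers(0, 20, size=2)) for _ in range(50)]

percentiles = percentile_surface(risk)
scores = np.asarray([percentiles[r, c] for (r, c) in events])
print(scores.mean())    # one possible summary statistic, as suggested above
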
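One reading of the likelihood idea, sketched here as the mean log-probability that the prediction, once normalised to sum to one over the grid, assigns to the cells containing the actual events. The exact normalisation studied in the later notebook may differ; this is purely illustrative, with invented data.

In [ ]:

import numpy as np

def mean_log_likelihood(risk, event_cells):
    prob = risk / risk.sum()               # normalise to a discrete distribution
    return np.mean([np.log(prob[r, c]) for (r, c) in event_cells])

rng = np.random.default_rng(2)
risk = rng.random((20, 20)) + 0.01         # keep strictly positive
events = [tuple(rng.integers(0, 20, size=2)) for _ in range(50)]
print(mean_log_likelihood(risk, events))
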
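A sketch of the KDE idea: estimate a density from the actual event locations with scipy.stats.gaussian_kde, evaluate it at the grid cell centres, and compare the result to the prediction (here with a simple sum of absolute differences, which is just one possible choice of comparison). The coordinates and grid below are invented toy data.

In [ ]:

import numpy as np
import scipy.stats

rng = np.random.default_rng(3)
risk = rng.random((20, 20))
risk /= risk.sum()                          # prediction as a distribution
event_coords = rng.random((2, 50)) * 20     # (x, y) coordinates of 50 events

# KDE of the actual events, evaluated at unit-cell centres of a 20x20 grid.
kernel = scipy.stats.gaussian_kde(event_coords)
xs, ys = np.meshgrid(np.arange(20) + 0.5, np.arange(20) + 0.5)
kde_on_grid = kernel(np.vstack([xs.ravel(), ys.ravel()])).reshape(20, 20)
kde_on_grid /= kde_on_grid.sum()            # normalise over the grid

print(np.abs(kde_on_grid - risk).sum())     # smaller = closer agreement
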
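One possible adaptation of the Brier score, sketched on toy data: treat the normalised prediction as the "probability" attached to each cell, and the observation as 1 if the cell saw at least one event and 0 otherwise. This merely illustrates the shape of such a score; it is not a settled definition.

In [ ]:

import numpy as np

def brier_like_score(risk, event_cells):
    prob = risk / risk.sum()
    observed = np.zeros_like(prob)
    for (r, c) in event_cells:
        observed[r, c] = 1.0                # cell contained at least one event
    return np.mean((prob - observed) ** 2)

rng = np.random.default_rng(4)
risk = rng.random((20, 20))
events = [tuple(rng.integers(0, 20, size=2)) for _ in range(50)]
print(brier_like_score(risk, events))
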
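Finally, a sketch of the information-theoretic idea: take the prediction as a prior distribution over cells, form an empirical distribution from the observed event counts (lightly smoothed to avoid zeros), and measure how far apart the two are with the Kullback-Leibler divergence. The later treatment may well differ; the smoothing constant and data here are invented.

In [ ]:

import numpy as np

def kl_divergence(posterior, prior):
    """KL divergence D(posterior || prior) for strictly positive arrays."""
    return np.sum(posterior * np.log(posterior / prior))

rng = np.random.default_rng(5)
risk = rng.random((20, 20)) + 0.01
prior = risk / risk.sum()                   # prediction as a prior

# Empirical distribution of 50 toy events, with light additive smoothing.
counts = np.zeros((20, 20))
for _ in range(50):
    r, c = rng.integers(0, 20, size=2)
    counts[r, c] += 1
posterior = (counts + 0.1) / (counts + 0.1).sum()

print(kl_divergence(posterior, prior))      # 0 would mean perfect agreement
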
In [ ]: